THE RECORDING AND PRELIMINARY ANALYSIS OF A DATA BASE FOR THE ASSESSMENT OF 'STRAIN'
IN AIR TRAFFIC CONTROLLERS, USING SPEECH


of high and low activity was tape recorded during the Farnborough International
Airshow in 1978. A description of the data base is given together with the activity
measures used. The data base was obtained to provide a means of testing
the hypothesis that the speech signal can be used to assess 'strain' or the
effects of increasing 'stress' in work. Preliminary statistical analysis of the
voice 'pitch' of one of the controllers has shown that periods of high and low
activity may be readily discriminated using several 20 second segments of voiced
speech.

Copyright © Controller HMSO London 1980



LIST OF CONTENTS

1  INTRODUCTION
2  DESCRIPTION OF THE DATA BASE
3  ACTIVITY MEASURES
4  VOICE PITCH ANALYSIS - PRELIMINARY RESULTS
5  CONCLUSIONS
Acknowledgments
References
Illustrations (Figures 1-4)
Report documentation page


FS  334 




1  INTRODUCTION 

The potential use of speech in the assessment of 'strain' in a man at work
has been discussed in some detail by the author in Ref 1. The reader is referred
to that Report for a definition of terms used by the author (eg 'stress' and
'strain') and for background on the characteristics of speech. Ideally the
present report should be read in conjunction with Ref 1. One of the potentially
most useful parameters of speech for assessing 'strain' is the voice fundamental
frequency (referred to hereafter as 'pitch'). In order to investigate and
quantify any changes in voice 'pitch' characteristics due to increasing mental
workload, which may result in increased 'strain', a suitable data base was
required. Such a data base would contain sufficient speech from a number of
subjects under varying workloads to enable meaningful statistical measures of 'pitch'
changes to be estimated (eg mean and standard deviation). Air Traffic Controllers
(ATCs) would seem to be ideal candidates for providing such a data base since
their task requires them to communicate by speech and may at times be very
demanding. It was decided that the Farnborough International Airshow (1978)
would be a suitable venue for recording ATCs at work, since it was envisaged that
periods of low and very high activity could be recorded there.

In  order  to  enable  changes  of  the  voice  'pitch'  characteristics  to  be 
correlated  with  workload  level  or  task  difficulty,  an  independent  activity 
measure  was  used.  The  purpose  of  this  report  is  to  describe  the  data  base  and 
the  activity  measure  used,  and  to  present  the  preliminary  results  of  'pitch' 
analysis  on  one  of  the  ATCs  recorded. 

2  DESCRIPTION  OF  THE  DATA  BASE 

ATCs  at  two  radar  and  the  tower  positions  were  recorded  over  a  period  of 
four  days  from  4-7  September.  Recordings  were  made  on  some  of  the  mornings  and 
on  all  afternoons  commencing  at  around  1700  hours,  after  the  flying  display. 

Most  controllers  varied  their  position  from  day  to  day,  some  positions  being 
busier than others. The busiest position was expected to be the tower immediately
after the official flying display when demonstration flights would be
required and VIPs would be leaving Farnborough and Blackbushe Airport by
helicopter and light plane. Recordings were made by tapping into the microphone
and incoming RT lines outside the controllers' positions to avoid
the distracting presence of other staff. NAGRA 4SJ portable tape recorders
were used at a tape speed of 1½ inches per second. The frequency response
at  this  speed  was  considered  adequate  for  subsequent  'pitch'  tracking. 

A total of ten ATCs were recorded over the four day period, providing nearly
23 hours of material. However, because the rota for each ATC position was outside
the control of the team making the recordings, only one ATC produced a
sufficiently large and suitable data base representing periods of high and low
activity. This was disappointing since it had been hoped that several controllers
could be used in the following studies to give greater significance to the
results. The controller used in the study invariably occupied the Tower position
which proved, as expected, to be the busiest of all those recorded during the
late afternoon after the flying display. Nearly 8 hours of recordings were
made of this controller, who was used for all the 'pitch' analysis
described in section 4.

3  ACTIVITY  MEASURES 

In order to attempt to correlate overall changes in the 'pitch' of the
ATC's voice with varying work levels or task demands, some independent measure
of the work level was required. Through consultation with the Air Traffic
Control Evaluation Unit (ATCEU) at Hurn Airport the activity measure described
below was adopted. ATCEU have found this measure to be a useful method of
monitoring and quantifying the ATC's workload.

The activity measure is simply a record, for successive 5 minute periods, of
the number of RT communications between the controller and the aircraft, the
number of telephone communications and the number of Direct Voice Liaison (DVL)
messages between controllers. These records can then be tabulated and an activity
histogram drawn up with the number of aircraft movements in each 5 minute period
(see Fig 1). The number of messages was monitored either in real time using
coding sheets (Fig 2) or subsequently from the tape recording. On the afternoon
and early evening of 6 September a total of 109 aircraft movements were recorded
during a 90 minute period, a particularly high figure for ATC operations.
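
The tabulation described above can be sketched in a few lines of Python. This is only an illustration of counting events in successive 5 minute periods: the function name `activity_counts`, the event labels and the example log are all invented for the sketch and are not taken from the coding sheets or the recordings.

```python
from collections import Counter

PERIOD_S = 5 * 60  # length of one activity period, in seconds


def activity_counts(events, period_s=PERIOD_S):
    """Count events of each kind in successive fixed-length periods.

    events: iterable of (time_s, kind) pairs, where time_s is the time in
    seconds from the start of the recording and kind is a label such as
    "RT", "TEL", "DVL" or "MOVEMENT" (hypothetical labels).
    Returns a dict mapping period index -> Counter of kinds.
    """
    table = {}
    for time_s, kind in events:
        index = int(time_s // period_s)  # 0 for 0-5 min, 1 for 5-10 min, ...
        table.setdefault(index, Counter())[kind] += 1
    return table


# Invented example log, not data from the report.
log = [(12, "RT"), (95, "RT"), (140, "TEL"), (299, "MOVEMENT"),
       (300, "RT"), (610, "DVL")]
table = activity_counts(log)
for index in sorted(table):
    start_min = index * PERIOD_S // 60
    print(f"{start_min:3d}-{start_min + 5:3d} min", dict(table[index]))
```

Each row of the resulting table corresponds to one bar group of an activity histogram of the kind referred to as Fig 1.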

These  activity  measures  have  the  recognised  weakness,  however,  of  being 
misleading  in  the  case  where  a  controller  may  not  be  controlling  a  large  number 
of  aircraft  at  a  particular  time,  but  has  a  difficult  problem  with  one  or  more 
aircraft.  In  this  case  the  record  of  the  number  of  messages  may  be  low  but  the 
controller  may  be  under  a  high  mental  workload.  Caution  must  therefore  be 
exercised  in  interpreting  these  activity  measures.  It  is  to  be  expected  that 
speech  analysis,  and  perhaps  ’overall  pitch’  changes  in  particular,  may  provide 
a  useful  measure  of  the  ’strain’  due  to  the  workload  in  this  situation  where  the 
activity  measure  on  its  own  would  fail. 



4  VOICE  PITCH  ANALYSIS  -  PRELIMINARY  RESULTS 

The reader is referred to Ref 1 for a discussion of 'pitch' tracking and
methods. It is not the purpose of this report to describe or discuss the techniques
which can be used for 'pitch' tracking. The rationale behind the use of
voice 'pitch' for the assessment of 'strain', or the effects of varying mental
workloads, is also discussed in Ref 1. In order to produce statistical parameters
of the voice 'pitch', such as mean and standard deviation, at least 20 seconds of
voiced speech is required (Ref 2). Since speech is not composed entirely of voiced
sounds (where 'pitch' is present) and the controller is not speaking continuously,
several minutes or more of recording may be required to produce a single estimate
of the mean and standard deviation of the voice 'pitch'. Before any significance
can be attached to differences in the means or standard deviations of various
estimates, ideally a large number of estimates are required - hence the requirement
for a large data base, preferably with many speakers. Unfortunately, as
mentioned previously, the requirements were not met for more than one controller
in this exercise and only one controller could be selected to test the potential
of 'pitch' tracking and analysis in assessing 'strain' or the effects of workload.

The  preliminary  analysis  was  carried  out  using  the  facilities  of  the  Joint 
Speech  Research  Unit  (JSRU),  Cheltenham  since  the  author's  'pitch'  tracker*  had 
not,  at  the  time,  been  interfaced  to  a  computer  facility.  A  Cepstrum  processor 
(Ref  1,  p  12)  interfaced  to  a  PDP11/40  computer  was  used  to  process  the  tapes. 
Before  processing  recordings  made  at  the  Tower  position  on  6  and  7  September,  the 
tapes  were  edited  by  hand  to  remove  any  unwanted  speech  from  other  controllers 
or  pilots  speaking  to  the  Tower  controller.  The  material  was  divided  into  two 
categories,  periods  of  low  and  high  activity,  as  assessed  from  the  activity 
measures  described  in  section  3.  Recordings  made  on  the  morning  of  7  September 
provided  the  low  activity  material  and  recordings  made  in  the  afternoons  of  the 
6  and  7  September,  the  high  activity  material  (Figs  1  and  3).  The  'pitch'  from 
as  many  20  second  segments  of  voiced  speech  as  possible  was  extracted  from  the 
recordings  using  the  Cepstrum  processor.  Data  for  each  segment  were  stored  and 
labelled  on  the  computer  for  subsequent  statistical  analysis. 
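
The grouping of extracted 'pitch' values into 20 second segments of voiced speech can be illustrated as follows. This is a minimal sketch under stated assumptions, not a description of the JSRU software: it assumes the tracker delivers one value per fixed frame (the 10 ms default is an assumption), marks unvoiced frames or silence with 0 or None, and discards an incomplete final segment. The function name `voiced_segments` is hypothetical.

```python
def voiced_segments(f0_track, frame_s=0.01, segment_s=20.0):
    """Group voiced frames into segments each holding segment_s of voiced speech.

    f0_track: sequence of 'pitch' values in Hz, one per frame, with 0 or None
    marking unvoiced frames and pauses (an assumed convention).
    frame_s: frame period in seconds (assumed, not given in the report).
    An incomplete final segment is discarded.
    """
    frames_needed = round(segment_s / frame_s)
    segments, current = [], []
    for f0 in f0_track:
        if not f0:  # skip unvoiced frames and silence
            continue
        current.append(float(f0))
        if len(current) == frames_needed:
            segments.append(current)
            current = []
    return segments


# Tiny invented track, using 1 s frames and 3 s segments to keep it readable.
track = [160.0, 0, 162.0, 158.0, None, 170.0, 172.0, 0, 0, 168.0, 171.0]
segs = voiced_segments(track, frame_s=1.0, segment_s=3.0)
print(segs)
```

Because unvoiced frames and pauses are skipped, a segment of 20 seconds of voiced speech will generally span a considerably longer stretch of tape.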

Various statistical parameters were computed for each segment but it was
decided beforehand that only the mean and mean absolute deviation would be used
initially since previous work (cited in Ref 1) has indicated that these parameters
are likely to be the most significant in identifying changes in 'pitch' due to
'strain'. Hecker et al (Ref 3), for example, found an overall increase in mean 'pitch'
and a decrease in 'pitch' range under 'task induced stress'. From the physiological
standpoint there is evidence to suggest that an increase in laryngeal muscle
tension and/or an increase in sub-glottal pressure (pressure below the vocal
cords) gives rise to changes in 'pitch' (Refs 4 and 5). Such physiological changes may well
occur under certain levels of task induced 'stress' or mental workload.

The mean absolute deviation was used rather than standard deviation since
it is less sensitive to outlying points (erroneous 'pitch' values produced by the
Cepstrum processor, eg 'pitch' doubling).
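
The difference in sensitivity can be seen in a small numerical sketch. The values below are invented, and the deviation is taken about the arithmetic mean, which is an assumption since the report does not state the exact definition used. One value in the second list is doubled to imitate a 'pitch' doubling error.

```python
import statistics


def mean_absolute_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean (assumed definition)."""
    m = sum(values) / len(values)
    return sum(abs(v - m) for v in values) / len(values)


# Invented 'pitch' values in Hz for one segment.
clean = [158, 160, 162, 159, 161, 160, 157, 163, 160, 160]
# The same segment with the last value doubled, imitating 'pitch' doubling.
corrupted = clean[:-1] + [2 * clean[-1]]

mad_clean = mean_absolute_deviation(clean)
mad_bad = mean_absolute_deviation(corrupted)
sd_clean = statistics.pstdev(clean)
sd_bad = statistics.pstdev(corrupted)

print("mean absolute deviation grows by a factor of", mad_bad / mad_clean)
print("standard deviation grows by a factor of", sd_bad / sd_clean)
```

Both measures are inflated by the erroneous value, but the standard deviation is inflated more because it squares each deviation before averaging.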

A scatter plot of the mean 'pitch' versus the mean absolute deviation for
all the segments, representing both high and low activity, is shown in Fig 4.
No suitable statistical measure was found to demonstrate the significance.
However, it is fairly clear that two distinct clusters exist in the scatter plot,
enabling the periods of high and low activity to be discriminated. The two
classes appear to be well separated along the mean 'pitch' axis, with only two points
from the total of 21 overlapping. The overall mean 'pitch' is 159 Hz, with a
mean absolute deviation of 25.8 Hz, for the low activity class and 174 Hz, with
a mean absolute deviation of 20.6 Hz, for the high activity class.
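
The size of the shift between the two classes can be worked out directly from the four figures quoted above. The sketch below uses only those figures; the conversion of the mean 'pitch' ratio to semitones is an added convenience and is not a measure used in the report.

```python
import math

# Class figures quoted in the text (Hz).
low_mean_hz, low_mad_hz = 159.0, 25.8
high_mean_hz, high_mad_hz = 174.0, 20.6

# Rise in mean 'pitch' from low to high activity.
mean_rise_hz = high_mean_hz - low_mean_hz
mean_rise_pct = 100.0 * mean_rise_hz / low_mean_hz
mean_rise_semitones = 12.0 * math.log2(high_mean_hz / low_mean_hz)

# Fall in mean absolute deviation from low to high activity.
mad_fall_hz = low_mad_hz - high_mad_hz
mad_fall_pct = 100.0 * mad_fall_hz / low_mad_hz

print(f"mean 'pitch' rise: {mean_rise_hz:.1f} Hz "
      f"({mean_rise_pct:.1f} %, {mean_rise_semitones:.2f} semitones)")
print(f"mean absolute deviation fall: {mad_fall_hz:.1f} Hz ({mad_fall_pct:.1f} %)")
```

On these figures the mean 'pitch' rises by 15 Hz and the mean absolute deviation falls by about 5 Hz between the low and high activity classes.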

These results agree with the findings of Hecker et al (Ref 3) in their experiments
on the 'manifestations of task induced stress in the acoustic speech signal'.
Several subjects investigated by this group showed an overall increase in 'pitch'
and a tendency to speak in more of a monotone (ie a decrease in mean absolute
deviation) when they were under task induced stress.

Other work carried out by Williams and Stevens (both members of the team
involved in the previous experiments) on the effects of emotion on speech,
particularly of pilots in flight, has shown an increase in median 'pitch' and
'pitch' range for several seconds of speech recorded in situations of fear (Refs 6 and 7).
It is postulated that this may be a result of a lack of motor control and possibly
tremor. The fact that in this instance the 'pitch' range has increased rather
than decreased, as in the case of 'task induced stress', illustrates the need for
caution in relating work done on changes in speech under different kinds of emotion
to possible changes which may occur under mental workload or 'task induced
stress'. It should be said also that the observations made by Williams and
Stevens were made in situations of extreme emotion prior to fatal air crashes
and the 'pitch' changes were larger than those reported in this study. It could
however be speculated that the large variations in 'pitch' range observed in
situations of fear and anxiety may also be observed at high levels of mental workload
or 'task induced stress' where the subject may lose complete control of the
situation.

The effects on 'pitch' reported in this study and that of Hecker et al (Ref 3)
may represent an intermediate stage where a subject is 'stressed' but in
control of the situation. If this is so, measurement of 'pitch' characteristics
may provide a useful tool for investigating the limit of acceptable workload.

In  the  present  study  a  number  of  other  speakers  would  be  required  to 
establish  statistically  the  significance  of  these  observations.  However,  the 
observations  outlined  above  are  sufficiently  encouraging  to  suggest  that  voice 
'pitch'  analysis  may  provide  a  useful  method,  either  on  its  own  or  combined  with 
other methods, of assessing mental workload and 'strain' in ATCs or other subjects
whose task requires speech communication.

5  CONCLUSIONS 

A data base of ATC communications during periods of low and high controller
activity has been collected. The exercise has illustrated the problems of
obtaining a large data base using a number of speakers in an operational environment.
Preliminary analysis of the voice 'pitch' of one ATC occupying the Tower
position over two consecutive days has shown that periods of high and low activity,
as measured by an independent activity measure, can be distinguished by
changes in mean voice 'pitch' and mean absolute deviation. Further analysis
with more subjects needs to be carried out to establish the consistency with
which voice 'pitch' changes may be used to assess mental workload and 'strain'.
The small study reported here has, however, revealed the potential of speech
analysis, and 'pitch' changes in particular, for assessing mental workload and
'strain'.

Acknowledgments 

The  author  would  like  to  thank  a  number  of  people  for  their  help  and  active 
participation  in  the  work  described  in  this  Memorandum  -  the  Senior  Air  Traffic 
Control  Officer  and  all  the  Controllers  who  cooperated  in  the  recordings  made 
during  the  Farnborough  Air  Show;  Mr  H.  Howells  and  Mr  C.  Ellis  (FS4)  for  their 
help  in  making  the  recordings  and  monitoring  the  controllers'  activities;  the 
ATCEU for their advice on activity measures; Mr J. Holmes of the JSRU for providing
their facilities for the speech analysis and Mr J. Bridle for his help
and  advice  in  carrying  out  the  analysis. 


REFERENCES

No.  Author, Title, etc

1  J.B. Peckham
   A device for tracking the fundamental frequency of speech and its
   application in the assessment of 'strain' in pilots and air traffic
   controllers.
   RAE Technical Report 79056 (1979)

2  Y. Horii
   Some statistical characteristics of voice frequency.
   J. Speech and Hearing Research, 18, 192-201 (1975)

3  M. Hecker et al
   Manifestations of task induced stress in the acoustic speech signal.
   J. Acoust. Soc. Am., Vol 44, 4, 993-1001 (1968)

4  P. Ladefoged, P. McKinney
   Loudness, sound pressure, and subglottal pressure in speech.
   J. Acoust. Soc. Am., Vol 35, 4, 454-460 (1963)

5  J.E. Atkinson
   Correlation analysis of the physiological factors controlling
   fundamental voice frequency.
   J. Acoust. Soc. Am., Vol 63, 1, 211-222 (1978)

6  C.E. Williams, K.N. Stevens
   On determining the emotional state of pilots during flight: an
   exploratory study.
   Aerospace Medicine, 1369-1372, December 1969

7  C.E. Williams, K.N. Stevens
   Emotions and speech: some acoustical correlates.
   J. Acoust. Soc. Am., Vol 52, 4, (part 2), 1238-1250 (1972)


REPORTS QUOTED ARE NOT NECESSARILY
AVAILABLE TO MEMBERS OF THE PUBLIC
OR TO COMMERCIAL ORGANISATIONS


[Figs 1-4 not reproduced. Surviving axis label from Fig 4: mean absolute deviation]


